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Abstract. We define and construct efficient depth-universal and almost- 
size-universal quantum circuits. Such circuits can be viewed as general- 
purpose simulators for central classes of quantum circuits and can be used 
to capture the computational power of the circuit class being simulated. 
For depth we construct universal circuits whose depth is the same order 
as the circuits being simulated. For size, there is a log factor blow-up in 
the universal circuits constructed here. We prove that this construction 
is nearly optimal. 

1 Introduction 

Like resource-bounded universal Turing machines, efficiently constructed univer- 
sal circuits capture the hardness of languages computed by circuits in a given 
circuit class. As a result, the study of the existence and complexity of univer- 
sal circuits for quantum circuit classes provides insight into the computational 
strength of such circuits, as well as their limits. 

There is both a theoretical and a practical aspect to this study. The existence 
of a universal circuit family for a complexity class defined by resource bounds 
(depth, size, gate width, etc.) provides an upper bound on the resources needed 
to compute any circuit in that class. It also opens up possibilities for proving 
lower bounds on the hard languages in the class, as such bounds would follow 
from a lower bound proof for the language computed by a universal circuit family 
for the circuit class. 

More precisely, the specific, efficient construction of a universal circuit for a 
class of circuits yields, for a fixed input size, a single circuit which can be used 
to carry out the computation of every circuit (with that same input size) in that 
family, basically a chip or processor for that class of circuits. The more efficient 
the construction of the universal circuit, the smaller the processor for that class. 
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Furthermore, the universal circuit is in a sense a compiler for all possible 
computations of all circuits in this family. It can be used to efficiently program 
all possible computations capable of being carried out by circuits in this circuit 
class, and in doing so automatically acts as a general purpose simulator and with 
as little loss of efficiency as is possible. 

In the case of quantum circuits there are particular issues relating to the 
requirements that computations must be clean and reversible which come into 
play, and to an extent complicate the classical methods. Still much of our mo- 
tivation for this work originates with classical results due to Cook, Valiant, and 
others [CH85,Val76]. Cook and Hoover considered depth universality and de- 
scribed a depth-universal uniform circuit family for circuits of depth J7(logn). 
Valiant studied size universality and showed how to construct universal circuits 
of size O(slogs) to simulate any circuit of size s. (See Section 1.1.) 

Definition 1 (Universal Quantum Circuits). Fix n > and let C be a 

collection of quantum circuits on n qubits. A quantum circuit U on n + m qubits 
is universal for C if, for every circuit C € C, there is a string x € {0, l} m (the 
encoding) such that for all strings y G {0, 1}™ (the data), 

U(\y) <g> \x)) = C\y) <g> \x). 

The circuit collections we are interested in are usually defined by bounding 
various parameters such as the size (number of gates), depth (number of layers 
of gates acting simultaneously on disjoint sets of qubits), or palette of allowed 
gates (e.g., Hadamard, vr/8, CNOT). 

As in the classical case, we also want our universal circuits to be efficient 
in various ways. For one, we restrict them to using the same gate family as the 
circuits they simulate. We may also want to restrict their size or the number 
m of qubits they use for the encoding. We are particularly concerned with the 
depth of universal circuits. 

Definition 2 (Depth-Universal Quantum Circuits). Fix a family T of uni- 
tary quantum gates. A family of quantum circuits {C/n,d}n,d>o is depth-universal 
over T if 

1. U n _d is universal for n-qubit circuits with depth < d using gates from T, 
2- UnA only uses gates drawn from T, 

3. UnA has depth 0(d), and 

4. the number of encoding qubits of U n ,d is polynomial in n and d. 

Depth-universal circuits are desirable because they can simulate any circuit 
within a constant slow-down factor. Thus they are as time-efficient as possible. 

Our first result, presented in Section 3, shows that depth-universal quantum 
circuits exist for the gate families T = {H, T} U {F n \ n > 1} and T' = {H, T} U 
{F n | n > 1} U {A rl (A) | n > 1}, where H and T are the Hadamard and 7r/8 
gates, respectively, and F n and A„(A) are the (n + l)-qubit fanout and (n+ 1)- 
qubit Toffoli gates, respectively (see Section 2). 



Theorem 3. Depth-universal quantum circuits exist over J 7 and over T' . Such 
circuits use 0(n 2 d) qubits and can be built log-space uniformly in n and d. 

Note that the results for the two circuit families are independent, because 
it is not known whether n-qubit Toffoli gates can be implemented exactly in 
constant depth using single-qubit gates and fanout gates, although they can be 
approximated this way [HS05]. 

It would be nice to find depth-universal circuits over families of bounded- 
width gates 4 such as {H, T, CNOT}. Depth-universal circuits with bounded- 
width gates, if they exist, must have depth J7(log n) and thus can only depth- 
cfficicntly simulate circuits with depth f2(logn). This can be easily seen as fol- 
lows: Suppose all you wanted was a universal circuit U for depth-1 circuits on n 
qubits that use CNOT gates only. Since any pair of the n qubits could potentially 
be connected with a CNOT gate, that pair must be connected somehow (indi- 
rectly perhaps) within the circuit U. Thus any data input qubit can potentially 
affect any of the other n — 1 data output qubits. Since U only has const ant- width 
gates, the number of qubits affected by any given data input increases by only 
a constant factor per layer, and so U must have J7(log n) layers. 

One can therefore only hope to find depth-universal circuits for circuits of 
depth J?(logn) over bounded-width gates. Although such circuits exist in the 
classical case (see below), we are unable to construct them in the quantum case 
(see Section 6). 

1.1 Other relevant work 

The study of quantum circuit complexity was originated by Yao [Yao]. The 
basic definitions and first results in this research area can be found in Nielsen 
and Chuang [NCOO] . Most of the research on universal quantum circuit classes 
deals with finding small, natural, universal sets of gates which can be used in 
quantum circuits to efficiently simulate any quantum computation. Our problem 
and point of view here is quite different. We have the goal of constructing, for 
a natural class C of quantum circuits, a single family of quantum circuits which 
can efficiently simulate all circuits on the class C. In this paper we consider 
classes C which have significant resource bounds (small or even constant depth, 
or fixed size) and ask that the corresponding universal circuits family to have 
similar depth or size bounds. 

Cook and Hoover [CH85] considered the problem of constructing general- 
purpose classical (Boolean) circuits using gates with fanin two. They asked 
whether, given n,c,d, there is a circuit U of size and depth O(d) that 

can simulate any n-input circuit of size c and depth d. Cook and Hoover con- 
structed a depth-universal circuit for depth J?(logn) and polynomial size, but 
which takes as input a nonstandard encoding of the circuit, and they also pre- 
sented a circuit with depth O(lognloglogn) to convert the standard encoding 
of the circuit to the required encoding. 



4 The width of a gate is the number of qubits it acts upon. 



Valiant looked at a similar problem — trying to minimize the size of the uni- 
versal circuit [Val76]. He considered classical circuits built from fanin 2 gates 
(but with unbounded fanout) and embedded the circuit in a larger universal 
graph. Using switches at key vertices of the universal graph, any graph (circuit) 
can be embedded in it. He managed to create universal graphs for different types 
of circuits and showed how to construct a 0(clog c)-size and 0(c)-depth univer- 
sal circuit. He also showed that his constructions have size within a constant 
multiplicative factor of the information theoretic lower bound. 

For quantum circuits, Nielsen and Chuang (in [NC97]) considered the prob- 
lem of building generic universal circuits, or programmable universal gate arrays 
as they call them. Their universal circuits work on two quantum registers, a 
data register and a program register. They do not consider any size or depth 
bound on the circuits and show that simulating every possible unitary operation 
requires completely orthogonal programs in the program register. Since there 
are infinitely many possible unitary operations, any universal circuit would re- 
quire an infinite number of qubits in the program register. This shows that it 
is not possible to have a generic universal circuit which works for all circuits 
of a certain input length. However they showed that it is possible to construct 
an extremely weak type of probabilistic universal circuit with size linear in the 
number of inputs to the simulated circuit. 

Sousa and Ramos considered a similar problem of creating a universal quan- 
tum circuit to simulate any quantum gate [SR07] . They construct a basic build- 
ing block which can be used to implement any single-qubit or CNOT gate on 
n qubits by switching certain gates on and off. They showed how to combine 
several of these building blocks to implement any n-qubit quantum gate. 

1.2 Outline of the paper 

For the rest of the paper, we will use U to denote the universal circuit and C to 
denote the circuit being simulated. We define the quantum gates we will use in 
Section 2. The construction of depth-universal circuits is in Section 3. We briefly 
describe the construction of almost-size-universal quantum circuits in Section 4. 
We mention a couple of miscellaneous results in Section 5. 

2 Preliminaries 

We assume the standard notions of quantum states, quantum circuits, and quan- 
tum gates described in [NCOO], in particular, H (Hadamard), T (n/8), S = T 2 
(phase), and CNOT (controlled NOT). We will also need some additional gates, 
which we now motivate. 

The depth-universal circuits we construct require the ability to feed the 
output of a single gate to many other gates. While this operation, commonly 
known as fanout, is common in classical circuits, copying an arbitrary quan- 
tum state unitarily is not possible in quantum circuits due to the no-cloning 



theorem [NCOO]. It turns out that we can construct our circuits using a classi- 
cal notion of fanout operation, defined as the fanout gate F n : \c,t\,... ,t n ) i— ► 
|c, c ti, . . . , c t n ) for any of the standard basis states |c) (the control) and 
\t\), . . . , \t n ) (the targets) and extended linearly to other states 5 [FFGHZ06]. F n 
can be constructed in depth lg n using CNOT gates. We need to use unbounded 
fanout gates to achieve full depth universality. We also use the unbounded Tof- 
foli gate A n (X) : \c\, . . . , c n , t) i— » \c\, . . . , c n , t © A"=i c i)- We reserve the term 
"Toffoli gate" to refer to the (standard) Toffoli gate A 2 (^0, which is defined on 
three qubits. 

In addition to the fanout gate, our construction requires us to use controlled 
versions of the gates used in the simulated circuit. For most of the commonly 
used basis sets of gates (e.g., Toffoli gate, Hadamard gate, and phase gate S), 
the gates themselves are sufficient to construct their controlled versions (e.g., a 
controlled Hadamard gate can be constructed using a Toffoli gate and Hadamard 
and phase gates). Depth or size universality requires that the controlled versions 
of the gates should be constructible using the gates themselves within proper 
depth or size, as required. 

Definition 4 (Closed under controlled operation). A set of quantum gates 
G = {Gi, . . .} is said to be closed under controlled operation if for each Gi G G, 
the controlled version of the gate C-G,|c)|t) — ► \c)G1\t) can be implemented in 
constant depth and size using the gates in G. Here, |c) is a single qubit and Gj 
could be a single or a multi-qubit gate. 

Note that CNOT = F 1: and given H, T, and CNOT we can implement the 
Toffoli gate via a standard constant-size circuit [NCOO] . We can implement the 
phase gate S as T 2 , and since T 8 = /, we can implement — T 6 and = T 7 . 

A generalized Z gate, which we will hereafter refer to simply as a Z gate, 
is an extension of the single-qubit Pauli Z gate (\x) >— > (— l) x \x)) to multiple 
qubits: 

\xi, ■ ■ ■ , x n ) A (_i)zi*2-zn \ Xu _ _ _ ; Xn y 

A Z gate can be constructed easily (in constant depth and size) from a single 
unbounded Toffoli gate (and vice versa) by conjugating the target qubit of the 
unbounded Toffoli gate with H gates (i.e., placing H on both sides of the Toffoli 
gate on its target qubit). 

Similarly, a Z-fanout gate Z n applies the single-qubit Z gate to each of n 
target qubits if the control qubit is set: 

Mi,--- ,t n ) ^(-i) c -^+-+*»)| c ,ti, ...,*„>. 

A Z n gate can be constructed from a single F n gate and vice versa in constant 
depth (although not constant size) by conjugating each target with H gates. 
So, in our depth-universal circuit construction, we can use cither or both of 
these types of gates. Similarly for unbounded Toffoli versus Z gates. Z gates 



5 This does not contradict the no-cloning theorem as only classical states are copied. 



and Z-fanout gates are important because they only change the phase, leaving 
the values of the qubits intact (they are represented by diagonal matrices in the 
computational basis). This allows us to use a trick due to H0yer and Spalck 
[HS05] and run all possible gates for a layer in parallel. 

3 Depth-universal quantum circuits 

In this section, we prove Theorem 3, i.e., that depth-universal circuits exist for 
each of the gate families 

F={H,T}u{F n \n>l}, 
T' = {H, T} U {F n | n > 1} U {A n {X) \n>l}. 

We first give the proof for T then show how to modify it for T' . 

The depth-universal circuit U we construct simulates the input circuit C layer 
by layer, where a layer consists of the collection of all its gates at a fixed depth. 
C is encoded in a slightly altered form, however. First, all the fanout gates in C 
are replaced with Z-fanout gates on the same qubits with H gates conjugating 
the targets. At worst, this may roughly double the depth of C (adjacent H gates 
cancel). Each layer of the resulting circuit is then separated into three adjacent 
layers: the first having only the H gates of the original layer, the second only 
the T gates, and the third only the Z-fanout gates. U then simulates each layer 
of the modified C by a constant number of its own layers. We describe next how 
these layers are constructed. 

Simulating single-qubit gates. The circuit to simulate an n-qubit layer of single- 
qubit gates of type G, say, consists of a layer of controlled-G gates where the 
control qubits are fed from the encoding and the target qubits are the data 
qubits. Figure 1 shows a layer of G gates, where G € {H, T}, controlled using 
H, S, T, CNOT, and Toffoli gates. To simulate G gates on qubits ii,. . . ,ik, 
say, set Cj 1 , . . . , Ci k to 1 and the rest of the c-qubits to 0. 




Fig. 1. Simulating a layer of single-qubit G gates with controlled G gates. The ancilla 
in the implementation of the controlled T gate is assumed part of the encoding. The 
ancilla is reset to at the end and hence can be reused for implementing all T layers. 



Simulating Z-fanout gates. The circuit to simulate a Z-fanout layer is shown 
in Figure 2. The top n qubits are the original data qubits. The rest are ancilla 
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Fig. 2. Simulating a layer of Z-fanout gates. 



qubits. All the qubits are arranged in n blocks B 1: . . . , B n of n qubits per block. 
The qubits in block Bi are labeled bn, . . . , b in . 

Each Ai subcircuit looks like Figure 3. The qubits Cn, . . . 7 Ci n are encoding 
qubits. The large gate between the two columns of Toffoli gates is a Z-fanout 
gate with its control on the ith ancilla (corresponding to bn and cu) and targets 
on all the other ancillae. 

Here is the state evolution from \d) = \d\ ■ ■ ■ d n ), suppressing the Cij qubits 
and ancillae internal to the Ai subcircuits in the ket labels. Note that after the 
first layer of fanouts, each qubit bij carries the value dj . 

|d,0,...,0) i ^ \d,d,...,d) 

^ (-l)E, *c„(E 3Vi d jCij )^ d ^ 
„ (-l)E, d^{X.^ d jCij )^ 0; . . . , ). 

To simulate some Z-fanout gate G of C whose control is on the ith qubit, 
say, we do this in block Bi by setting cu to 1 and setting Cjj to 1 for every j 
where the jth qubit is a target of G. All the other c-qubits in Bi are set to 0. 
We can do this in separate blocks for multiple Z-fanout gates on the same layer, 
because no two gates can share the same control qubit. Any c-qubits in unused 
blocks are set to 0. 
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Fig. 3. Subcircuit Ai in the simulation Fig. 4. Subcircuit ^4; for a layer of Z 
of Z-fanout gates. gates. 



Simulating unbounded Toffoli gates. We can modify the construction above to 
accommodate unbounded Toffoli gates (gate family T'), or equivalently Z gates, 
by breaking each layer of C into four adjacent layers, the first three being as 
before, and the fourth containing only Z gates. The top-level circuit to simulate 
a layer of Z gates looks just as before (Figure 2), except now each A\ subcircuit 
looks a bit different and is shown in Figure 4, where the central gate is a Z gate 
connecting the ancillae. 

As before, the qubits en, . . . , Cj„ are encoding qubits. The XZX gates on 
cu multiply the overall phase by (— l) Ci% . When the Z gate of A\ is applied, its 
jth contact point is in the state bijCij. Note that bijC^j — bij if c^j — 1 and 
1 otherwise. The Z gate then multiplies the overall phase by (— l)n 3 -(6«cy) _ 
(— l)^ j:c « =1 bl3 . The state thus evolves as given below: 



|d,0,...,0) i-» |d,d,...,d) 

^(-l) E ^ +n ^= lby |d,d,...,d) 
i — ^ (-l) Ei5 " +n ^^= l6y |d,0,...,0) 



To simulate some Z gate G of C whose first qubit is i, say, we do this in block 
Bi by setting cu to 1 and setting to 1 for every j where the jth qubit is 
part of G. All the other c-qubits in Bi are set to 0. As before, we can do this in 
separate blocks for multiple gates on the same layer, because no two gates can 
share the same first qubit. Any c-qubits in unused blocks are set to 0, and it is 
easy to check that this makes the block have no net effect. 



4 Size-universal quantum circuits 



Similar to a depth-universal circuit, a size-universal circuit is a universal cir- 
cuit with the same order of the number of gates as the circuit it is simulating. 
Formally, 

Definition 5. A family {U n . c } of universal circuits for n-qubit circuits of size 
< c is size-universal if SIZE(?7„ iC ) = O(c). 

A simple counting argument shows that it is not possible to obtain a com- 
pletely size-universal circuit for fanin-2 circuits. Consider all circuits with c fanin- 
2 gates where one input of each gate is the first qubit. There are (n— l) c possible 
circuits. Then consider similar circuits where there is no gate with input as the 
first qubit and continue recursively. Thus the number of possible fanin-2 circuits 
is Q{(n— l) c+1 ). Since all the encoding bits have to be connected to some of the 
fanin-2 gates in the universal circuit, it must have J?(clogn) gates. 

We use Valiant's idea of universal graphs [Val76] to construct a universal 
family of fanin-2 circuits that are very close to the afforementioned lower bound. 
As before, we would like to simulate C by using the same set of gates used in C. 
Our construction works for any circuit using unbounded Toffoli gates and any 
set of single-qubit and 2-qubit gates closed under the controlled operation. 

First we will define a universal directed acyclic graph with n special vertices 
(called poles) in which we can embed any circuit with n gates (considering the 
inputs also as gates). The embedding will map the wires in the circuit to paths 
in the graph. 

Definition 6 (Edge-embedding [Val76]). An edge- embedding poiG = (V, E) 
into G' = (V\E') maps V one-to-one to V and maps each edge (i, j) G E to 
a directed path p(i) p(j) in G' such that distinct edges are mapped to edge- 
disjoint paths. 

The graph of any circuit of size n can be represented as a directed acyclic 
graph with vertices {1, . . . ,n} such that there is no edge from j to i for i < j 
and each vertex has fanin and fanout 2. Let r 2 (n) be the set of all such graphs. 

Definition 7 (Edge- universal graph [Val76]). A graph G' is edge-universal 
for r2{n) if it has distinct poles pi,...,p n such that any graph G G r 2 (n) 
can be edge-embedded into G' where each vertex i e G is mapped to vertex 
p(i) =p«eG'. 

Then, Valiant shows how to construct a universal graph. 

Theorem 8 ([Val76]). There is a constant k such that for all n there exists 
an acyclic graph G' that is edge-universal for r 2 (n), and G' has knlgn vertices, 
each vertex having fanin and fanout 2. 

It is fairly easy to construct a universal circuit using the universal graph. 
In fact, the universal circuit for circuits with n inputs and c gates will be any 
edge-universal graph for _T 2 (n + c) . 



Consider any such edge-universal graph G' . Then G' has d = k(n+c) log(n + 
c) vertices for some k. These d vertices include fixed poles p\, . . . ,p n ,Pn+i > ■ ■ ■ ,Pn+c 
and non-pole vertices. Create a quantum circuit C with d gates (including the 
inputs and outputs) where G' describes how the gates connect to each other. For 
each of the vertices p\, . . . ,p n of G', remove their incoming edges and replace 
the vertices by the input as shown in Figure 5. Replace each of the vertices 
Pn+i, ■ ■ ■ ,Pn+c with a subcircuit that applies any of the single- or 2-qubit gates 
on the inputs, where the gate to apply is controlled by the encoding. E.g., Fig- 
ure 7 shows the gates at a pole vertex in a universal circuit simulating CNOT 
and H gates. For a non-pole vertex, replace it with a subcircuit that swaps the 
incoming and outgoing wires (i.e., first input is connected to second output and 
second input is connected to first output) or directly connects them (i.e., first 
input is connected to first output and similarly for the second input). Again, 
the subcircuit is controlled by the encoding which controls whether to swap or 
directly connect (see Figure 6). The edge disjointness property guarantees that 
wires in the embedded circuit are mapped to paths in C which can share a 
vertex but cannot share any edge. 
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gate(v out2 ) 



Fig. 5. The gate for a pole vertex pi is 
mapped to input x t . 
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Fig. 6. The gates at a non-pole vertex 
v. The encoding bit c v specifies if first 
output qubit should be mapped to first 
input or second input qubit and simi- 
larly for second output qubit. 
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Fig. 7. Example of the gates at a pole vertex v simulating a circuit with CNOT and 
H gates. The encoding bits c% specify which kind of gate is at vertex v, and the cjj 
specify which qubit the gate acts on (for H gate) or which is the control qubit (for 
CNOT gate). 



To simulate any fanin-2 circuit C with c gates acting on n qubits, construct 
the edge-universal graph G' for _T 2 (n + c) . Embed the graph of C into G' such 
that the input nodes of C are mapped to the poles p\ , . . . , p n in G' . Now for each 



gate of the circuit, consider the pole to which it was mapped. Set a bit in the 
encoding to denote the type of the gate at that pole. For the non-pole vertices, 
set a bit in the encoding to specify whether the two input values should be 
swapped or mapped directly to the two output values. The size of the encoding 
is (n+c) (log |gates| + l) + (|.T2(n+c)| — (n+c)) which isO(clogc) for polynomial- 
size circuits. This construction gives us a universal circuit with a logarithmic 
blow-up in size. 

Theorem 9. There is a constant k and a family of universal circuits U n>c that 
can simulate every circuit with c gates acting on n qubits such that SIZE(U n , c ) = 
k(n + c) log(n + c). 

We can use a similar idea for circuits with unbounded fanin. First we decom- 
pose the unbounded fanin gates using bounded fanin gates (fanin 2 in this case). 
This is doable for most of the common unbounded fanin gates. For example, an 
unbounded Toffoli gate of size / can be constructed using 0(f) successive Toffoli 
gates of size 3, which can in turn be implemented using Hadamard, phase, tt/8 
and CNOT gates [NCOO]. So any circuit of size c consisting of Hadamard, tt/8 
and unbounded Toffoli gates can be transformed into an equivalent circuit with 
size at most 0(cn) consisting of these single-qubit gates and CNOT gates. The 
rest of the construction follows as before. 

Corollary 1. There is a family of universal circuits U nyC that can simulate 
quantum circuits of size c on n qubits and consisting of Hadamard, tt/8, and 
unbounded Toffoli gates such that SIZE(U n ,c) — 0(nclog(nc)) . 

5 Other results 

Circuit encoding. We have been mostly concerned with the actual simulation of 
a quantum circuit C by the universal circuit U. It is possible, however, to hide 
some complexity of the simulation in C/'s description of C itself. Usually, the 
description of a classical circuit describes the underlying graph of the circuit and 
specifies the gates at each vertex. We can similarly describe a quantum circuit by 
its graph structure. The description is extremely compact with size proportional 
to the size of the circuit. However, we use a description that is more natural for 
quantum circuits and especially suitable for simulation. The description stores 
the grid structure of the circuit; the rows of the grid correspond to the qubits, 
and the columns correspond to the different layers of the circuit. This description 
is not unique for any given circuit and its size is 0(nd), where n is the number 
of qubits and d is the depth of the circuit. A graph-based description can be 
easily converted to this grid-based description in polynomial time. 

Depth-universal classical circuits. The techniques of Section 3 can be easily 
adapted to build depth-universal circuits for a variety of classical (Boolean) 
circuit classes with unbounded gates, e.g., AC, ACC, and TC circuits. The key 
reason is that these big gates arc all "self-similar" in the sense that fixing some 
of the inputs can yield a smaller gate of the same type. We will present these 
results in the full paper. 



6 Open Problems 



A number of natural, interesting open problems remain. 

Fanout gates are used in our construction of a depth-universal circuit family. 
Is the fanout gate necessary in our construction? We believe it is. In fact, we 
do not know how to simulate depth-d circuits over {H, T, CNOT} universally in 
depth 0{d) without using fanout gates, even assuming that the circuits being 
simulated have depth i?(logn). The shallowest universal circuits with bounded- 
width gates we know of have a lg n blow-up factor in the depth, just by replacing 
the fanout gates with log-depth circuits of CNOT gates. 

Our results apply to circuits with very specific gate sets. How much can 
these gate sets be generalized? Are similar results possible for any countable set 
of gates containing Hadamard, unbounded Toffoli, and fanout gates? 

We showed how to contruct a universal circuit with a logarithmic blow-up in 
size. The construction is within a constant factor of the minimum possible size for 
polynomial-size, bounded-fanin circuits. However for constant-size circuits, we 
believe the lower bound can be tightened to match the proven upper bound. For 
unbounded-fanin circuits, we construct a universal circuit with size 0(nc log nc) 
which is significantly larger than the bounded fanin lower bound of J7(clogn). 
We think that a better lower bound is possible for the unbounded-fanin case. 
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